Searching the conformational space for docking

In molecular modelling, docking is a method which predicts the preferred orientation of one molecule relative to another when the two are bound together in a stable complex. In the case of protein docking, the search space consists of all possible orientations of the protein with respect to the ligand. Flexible docking additionally considers all possible conformations of the protein paired with all possible conformations of the ligand.[1]

With present computing resources, it is impossible to explore these search spaces exhaustively; instead, many strategies attempt to sample the search space as efficiently as possible. Most docking programs in use account for a flexible ligand, and several attempt to model a flexible protein receptor. Each "snapshot" of the pair is referred to as a pose.
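To give a feel for why exhaustive exploration is out of reach, the following sketch counts the poses on a naive regular grid over rigid-body placement and ligand torsions. The discretisation scheme, the function name `count_grid_poses` and every step size are assumptions chosen for illustration; real programs do not enumerate poses this way.

```python
def count_grid_poses(n_translation_steps, angle_step_deg, n_torsions, torsion_step_deg):
    """Count poses on a naive regular grid (hypothetical discretisation)."""
    # Ligand centre placed on a cubic lattice inside the binding region.
    translations = n_translation_steps ** 3
    # Rigid-body orientation as three Euler angles:
    # two of them span 360 degrees, the third spans 180 degrees.
    per_full_turn = 360 // angle_step_deg
    rotations = per_full_turn * per_full_turn * (180 // angle_step_deg)
    # Each rotatable bond of the ligand is sampled independently.
    torsions = (360 // torsion_step_deg) ** n_torsions
    return translations * rotations * torsions
```

Under these assumptions, `count_grid_poses(20, 10, 6, 30)` (20 lattice steps per axis, 10 degree rotations, six rotatable bonds at 30 degree steps) is on the order of 10^14 poses for a rigid receptor, before any receptor flexibility is considered.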

Molecular dynamics (MD) simulations

In this approach, the protein is typically held rigid, and the ligand is allowed to freely explore its conformational space. The generated conformations are then docked successively into the protein, and an MD simulation consisting of a simulated annealing protocol is performed. This is usually supplemented with short energy minimization steps, and the energies determined from the MD runs are used to rank the poses. Although this method is computationally expensive (potentially involving hundreds of MD runs), it has some advantages: for example, no specialized energy or scoring functions are required. Standard MD force fields can typically be used to find reasonable poses that can be compared with experimental structures.
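The annealing idea can be sketched in a few lines. Note that this sketch substitutes Metropolis Monte Carlo moves for MD integration to stay short, so it illustrates the cooling protocol rather than an MD run. The one-dimensional `toy_energy` landscape, the geometric cooling schedule and all parameter values are assumptions for illustration, not taken from any docking package.

```python
import math
import random

def toy_energy(x):
    # Hypothetical double-well landscape: a local minimum near x = +1
    # and a deeper (global) minimum near x = -1.
    return (x * x - 1.0) ** 2 + 0.3 * x

def anneal(energy, x0, t_start=2.0, t_end=0.01, n_steps=5000, step=0.5, seed=0):
    """Simulated annealing with Metropolis acceptance; returns (best_x, best_energy)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for i in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / max(n_steps - 1, 1))
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-dE / T).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e
```

Starting from the shallower well, for example `anneal(toy_energy, x0=1.0)`, the high initial temperature lets the walker cross the barrier, and the low final temperature settles it into whichever minimum it has reached. In a docking setting the scalar `x` would be replaced by the full set of ligand coordinates and `toy_energy` by a force-field energy.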

The Distance Constrained Essential Dynamics method (DCED) has been used to generate multiple structures, called eigenstructures, for docking. Although this approach avoids most of the costly MD calculations, it can capture the essential motions involved in a flexible receptor, and so represents a form of coarse-grained dynamics.[2]

Shape-complementarity methods

Shape-complementarity methods, the most common technique in docking programs, focus on the match between the receptor and the ligand in order to find an optimal pose. Programs include DOCK,[3] FRED,[4] GLIDE,[5] SURFLEX,[6] eHiTS[7] and many more. Most methods describe the molecules in terms of a finite number of descriptors that cover structural complementarity and binding complementarity. Structural complementarity is mostly a geometric description of the molecules, including solvent-accessible surface area, overall shape and geometric constraints between atoms in the protein and ligand. Binding complementarity takes into account features such as hydrogen bonding interactions, hydrophobic contacts and van der Waals interactions to describe how well a particular ligand will bind to the protein. Both kinds of descriptors are conveniently represented as structural templates, which are then used to quickly match potential compounds (either from a database or from user-supplied input) that will bind well at the active site of the protein. Compared to all-atom molecular dynamics approaches, these methods find binding poses for the protein and ligand far more efficiently.
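As a rough illustration of a purely geometric (structural) descriptor, the sketch below rewards ligand atoms that sit at contact distance from receptor atoms and penalises atoms that overlap. It is not the scoring function of any of the programs named above; the function name `shape_score`, the distance cut-offs in angstroms and the penalty weight are all assumptions.

```python
import math

def shape_score(receptor, ligand, clash_dist=3.0, contact_dist=4.5, clash_penalty=10.0):
    """Toy shape-complementarity score for lists of (x, y, z) atom coordinates.

    Each receptor-ligand atom pair closer than clash_dist is penalised,
    each pair between clash_dist and contact_dist counts as one contact.
    """
    score = 0.0
    for lig_atom in ligand:
        for rec_atom in receptor:
            d = math.dist(lig_atom, rec_atom)
            if d < clash_dist:
                score -= clash_penalty   # steric overlap
            elif d <= contact_dist:
                score += 1.0             # favourable surface contact
    return score
```

For a receptor `[(0, 0, 0), (4, 0, 0)]`, a ligand atom at `(2, 3.5, 0)` lies about 4.03 from both receptor atoms and scores two contacts, whereas one at `(0, 1, 0)` clashes with the first atom and is penalised. Binding-complementarity terms such as hydrogen bonds would be added as further descriptors in the same spirit.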

Genetic algorithms

Two of the most widely used docking programs belong to this class: GOLD[8] and AutoDock.[9] Genetic algorithms allow the exploration of a large conformational space, which in this case is spanned jointly by the protein and the ligand, by representing each spatial arrangement of the pair as a "gene" with a particular energy. The entire genome thus represents the complete energy landscape which is to be explored. The evolution of the genome is simulated by cross-over techniques similar to biological evolution, in which random pairs of individuals (conformations) are "mated", with the possibility of a random mutation in the offspring. These methods have proven very useful for sampling the vast state space.
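The mating-and-mutation loop can be sketched as follows. This is a generic genetic algorithm, not the GOLD or AutoDock implementation: here each individual is simply a list of numbers standing in for the degrees of freedom of a pose, and `toy_energy`, tournament selection, uniform cross-over, elitism and all parameter values are assumptions for illustration.

```python
import random

def toy_energy(pose):
    # Hypothetical stand-in for a docking energy: minimum when every
    # degree of freedom equals 1.0.
    return sum((g - 1.0) ** 2 for g in pose)

def evolve(energy, n_genes=4, pop_size=30, n_generations=50,
           mutation_rate=0.1, seed=0):
    """Return (best_pose, best_energy_per_generation)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(n_genes)]
           for _ in range(pop_size)]
    history = []
    for _ in range(n_generations):
        pop.sort(key=energy)
        history.append(energy(pop[0]))
        next_pop = [pop[0][:]]  # elitism: carry the best pose over unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals.
            p1 = min(rng.sample(pop, 3), key=energy)
            p2 = min(rng.sample(pop, 3), key=energy)
            # Uniform cross-over: each value comes from either parent.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Random mutation: occasional Gaussian perturbation.
            child = [g + rng.gauss(0.0, 0.3) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    best = min(pop, key=energy)
    history.append(energy(best))
    return best, history
```

Because the best individual is always copied into the next generation, the recorded best energy never increases from one generation to the next. Running `evolve(toy_energy)` with several different `seed` values mimics the repeated runs that such stochastic searches generally need.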

Although genetic algorithms are quite successful at sampling the large conformational space, many docking programs require the protein to remain fixed, allowing only the ligand to flex and adjust to the active site of the protein. Genetic algorithms also require multiple runs to obtain reliable answers about which ligands may bind to the protein. A genetic algorithm run typically takes longer to arrive at a proper pose, so these methods may not be as efficient as shape-complementarity approaches for screening large databases of compounds. Recent improvements, including grid-based evaluation of energies, restricting the exploration of conformational changes to local areas of interest (active sites), and improved tabling methods, have significantly enhanced the performance of genetic algorithms and made them suitable for virtual screening applications.
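The gain from grid-based energy evaluation comes from computing receptor energies once, on a lattice around the active site, and then scoring each pose by table lookup. A minimal sketch follows, assuming nearest-grid-point lookup, a hypothetical step-shaped `soft_contact` pair energy, and zero energy for atoms that fall outside the grid; production codes typically use interpolation and one grid per atom type.

```python
import math

def soft_contact(d):
    # Hypothetical pair energy: repulsive below 3.0, attractive up to 4.5.
    if d < 3.0:
        return 10.0
    return -1.0 if d <= 4.5 else 0.0

def build_energy_grid(receptor, origin, spacing, shape, pair_energy=soft_contact):
    """Precompute the receptor energy felt by a probe atom at each grid point."""
    grid = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                grid[(i, j, k)] = sum(pair_energy(math.dist(point, atom))
                                      for atom in receptor)
    return grid

def grid_score(grid, ligand, origin, spacing):
    """Score a pose by summing precomputed energies at the nearest grid points."""
    total = 0.0
    for atom in ligand:
        idx = tuple(round((c - o) / spacing) for c, o in zip(atom, origin))
        total += grid.get(idx, 0.0)  # assumption: off-grid atoms contribute nothing
    return total
```

The grid is built once per receptor, so the cost of scoring a pose scales with the number of ligand atoms rather than with the number of receptor-ligand atom pairs, which is what makes repeated evaluation inside a genetic algorithm affordable.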

References

  1. Halperin I, Ma B, Wolfson H, Nussinov R (June 2002). "Principles of docking: An overview of search algorithms and a guide to scoring functions". Proteins 47 (4): 409–443. doi:10.1002/prot.10115. PMID 12001221.
  2. Mustard D, Ritchie DW (August 2005). "Docking essential dynamics eigenstructures". Proteins 60 (2): 269–274. doi:10.1002/prot.20569. PMID 15981272.
  3. Shoichet BK, Stroud RM, Santi DV, Kuntz ID, Perry KM (March 1993). "Structure-based discovery of inhibitors of thymidylate synthase". Science 259 (5100): 1445–1450. doi:10.1126/science.8451640. PMID 8451640.
  4. McGann MR, Almond HR, Nicholls A, Grant JA, Brown FK (January 2003). "Gaussian docking functions". Biopolymers 68 (1): 76–90. doi:10.1002/bip.10207. PMID 12579581.
  5. Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, Repasky MP, Knoll EH, Shelley M, Perry JK, Shaw DE, Francis P, Shenkin PS (March 2004). "Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy". J. Med. Chem. 47 (7): 1739–1749. doi:10.1021/jm0306430. PMID 15027865.
  6. Jain AN (February 2003). "Surflex: fully automatic flexible molecular docking using a molecular similarity-based search engine". J. Med. Chem. 46 (4): 499–511. doi:10.1021/jm020406h. PMID 12570372.
  7. Zsoldos Z, Reid D, Simon A, Sadjad SB, Johnson AP (July 2007). "eHiTS: a new fast, exhaustive flexible ligand docking system". J. Mol. Graph. Model. 26 (1): 198–212. doi:10.1016/j.jmgm.2006.06.002. PMID 16860582.
  8. Jones G, Willett P, Glen RC, Leach AR, Taylor R (April 1997). "Development and validation of a genetic algorithm for flexible docking". J. Mol. Biol. 267 (3): 727–748. doi:10.1006/jmbi.1996.0897. PMID 9126849.
  9. Goodsell DS, Morris GM, Olson AJ (1996). "Automated docking of flexible ligands: applications of AutoDock". J. Mol. Recognit. 9 (1): 1–5. doi:10.1002/(SICI)1099-1352(199601)9:1<1::AID-JMR241>3.0.CO;2-6. PMID 8723313.